feat(rest): add scan plan endpoints to REST catalog client#614
Conversation
4fde2a5 to
ab6c8d4
Compare
17f12b3 to
86714b9
Compare
|
I've just finished my first round of review. Thanks @gsandeep1241 for working on this and everyone for the review! Let me know what you think. |
gsandeep1241
left a comment
There was a problem hiding this comment.
Thanks @wgtmac for the detailed review. I am part way into addressing your comments. I'm posting the partial comments early because there was one open question I wanted to check - changes to catalog.h because if we go that route, we can simplify the PR.
The rest of your comments I don't have a conflict with and will address them.
| bool use_snapshot_schema = false; | ||
| std::optional<int64_t> start_snapshot_id; | ||
| std::optional<int64_t> end_snapshot_id; | ||
| std::vector<std::string> statsFields; |
| std::optional<int64_t> start_snapshot_id; | ||
| std::optional<int64_t> end_snapshot_id; | ||
| std::vector<std::string> statsFields; | ||
| std::optional<int64_t> min_rows_required; |
| /// \param table The table to scan. | ||
| /// \param context The scan context containing snapshot, filter, and other options. | ||
| /// \return A PlanTableScanResponse with the plan status and initial scan tasks. | ||
| virtual Result<rest::PlanTableScanResponse> PlanTableScan( |
There was a problem hiding this comment.
I debated this and I felt that a catalog handler should really honor a Request-Response contract. (That said, I'm not doing the "Request" part as such). The java implementation I felt handled core logic directly coupled with the handlers.
But if we want to keep it similar, it makes sense to split into two PRs to prevent this from getting complicated. I'll remove the stuff in catalog.h and rest_catalog.h.
Agree?
|
Is this ready to review? I see there are CI failures and unaddressed comments. Please also rebase it to the latest main branch to make sure namespace separator is respected. |
5f7c8f8 to
d388f81
Compare
|
@wgtmac Addressed all comments (except one, where I left a question for you). Please take a look when you get the chance. Thank you! |
- Add PlanTableScan, FetchPlanningResult, CancelPlanning, FetchScanTasks methods to RestCatalog, wiring them to the corresponding REST endpoints - Add scan plan endpoint definitions and resource path helpers - Add ScanPlanErrorHandler with proper 404/406 error dispatch (NoSuchPlanIdException, NoSuchPlanTaskException, etc.) - Add PlanTableScanRequest/FetchScanTasksRequest serialization helpers - Add kNoSuchPlanId and kNoSuchPlanTask ErrorKind values - Add DataFileFromJson and FileScanTasksFromJson overloads with partition spec and schema support - Add endpoint and integration tests for all 4 scan plan operations
- Flatten BaseScanTaskResponse hierarchy into three independent structs - Add error payload to PlanTableScanResponse and FetchPlanningResultResponse for failed status - Key delete file index by file_path instead of pointer identity - Add partition serialization to ToJson(DataFile) - Update tests for all changes
c2235ce to
3961bd7
Compare
|
I've pushed a commit to tighten the REST scan JSON serde to match Java/spec behavior more closely. This includes preserving residual filters, distinguishing missing vs empty scan task fields, requiring contextual DataFile serialization/deserialization with partition spec/schema validation, adding request-side FromJson support for scan requests, and simplifying delete-file deduplication by reusing DeleteFileSet. I also adjusted the tests to cover these parity cases in the existing style. Thanks @gsandeep1241 for working on this! |
3961bd7 to
e884cb4
Compare
Uh oh!
There was an error while loading. Please reload this page.